Abstract



We present a new way of converting a reversible finite Markov chain into a non- 
reversible one, with a theoretical guarantee that the asymptotic variance of the 
MCMC estimator based on the non-reversible chain is reduced. The method is 
applicable to any reversible chain whose states are not connected through a tree, 
and can be interpreted graphically as inserting vortices into the state transition 
graph. Our result confirms that non-reversible chains are fundamentally better 
than reversible ones in terms of asymptotic performance, and suggests interesting 
directions for further improving MCMC. 



1 Introduction 

Markov Chain Monte Carlo (MCMC) methods have gained enormous popularity over a wide variety 
of research fields [6, 8], owing to their ability to compute expectations with respect to complex, high
dimensional probability distributions. An MCMC estimator can be based on any ergodic Markov 
chain with the distribution of interest as its stationary distribution. However, the choice of Markov 
chain greatly affects the performance of the estimator, in particular the accuracy achieved with a 
pre-specified number of samples 0). 

In general, the efficiency of an MCMC estimator is determined by two factors: i) how fast the
chain converges to its stationary distribution, i.e., the mixing rate [9], and ii) once the chain reaches
its stationary distribution, how much the estimates fluctuate based on trajectories of finite length,
which is characterized by the asymptotic variance. In this paper, we consider the latter criterion.
Previous theory concerned with reducing asymptotic variance has followed two main tracks. The
first focuses on reversible chains, and is mostly based on the theorems of Peskun [10] and Tierney
[11], which state that if a reversible Markov chain is modified so that the probability of staying in
the same state is reduced, then the asymptotic variance can be decreased. A number of methods
have been proposed, particularly in the context of the Metropolis-Hastings method, to encourage the
Markov chain to move away from the current state, or its vicinity in the continuous case [12, 13].
The second track, which was explored just recently, studies non-reversible chains. Neal proved
in [14] that starting from any finite-state reversible chain, the asymptotic variance of a related
non-reversible chain, with reduced probability of back-tracking to the immediately previous state, will
not increase, and typically decrease. Several methods have been proposed by Murray based on this 
idea0. 
Neal's result suggests that non-reversible chains may be fundamentally better than reversible ones in
terms of the asymptotic performance. In this paper, we follow up this idea by proposing a new way
of converting reversible chains into non-reversible ones which, unlike in Neal's method, are defined
on the state space of the reversible chain, with the theoretical guarantee that the asymptotic variance
of the associated MCMC estimator is reduced. Our method is applicable to any reversible chain
whose state transition graph contains loops, including those whose probability of staying in the
same state is zero and thus cannot be improved using Peskun's theorem. The method also admits
an interesting graphical interpretation which amounts to inserting 'vortices' into the state transition
graph of the original chain. Our result suggests a new and interesting direction for improving the
asymptotic performance of MCMC.

The rest of the paper is organized as follows: section 2 reviews some background concepts and 
results; section 3 presents the main theoretical results, together with the graphical interpretation; 
section 4 provides a simple yet illustrative example and explains the intuition behind the results; 
section 5 concludes the paper. 

2 Preliminaries 

Suppose we wish to estimate the expectation of some real-valued function $f$ over domain $\mathcal{S}$, with
respect to a probability distribution $\pi$, whose value may only be known up to a multiplicative constant.
Let $A$ be a transition operator of an ergodic$^{1}$ Markov chain with stationary distribution $\pi$, i.e.,

$$\pi(x)\, A(x \to y) = \pi(y)\, B(y \to x), \quad \forall x, y \in \mathcal{S}, \tag{1}$$

where $B$ is the reverse operator as defined in [5]. The expectation can then be estimated through the
MCMC estimator

$$\hat{\mu}_T = \frac{1}{T} \sum_{t=1}^{T} f(x_t), \tag{2}$$

where $x_1, \dots, x_T$ is a trajectory sampled from the Markov chain. The asymptotic variance of $\hat{\mu}_T$,
with respect to transition operator $A$ and function $f$, is defined as

$$\sigma_A^2(f) = \lim_{T \to \infty} T\, V[\hat{\mu}_T], \tag{3}$$

where $V[\hat{\mu}_T]$ denotes the variance of $\hat{\mu}_T$. Since the chain is ergodic, $\sigma_A^2(f)$ is well-defined
following the central limit theorem, and does not depend on the distribution of the initial point. Roughly
speaking, asymptotic variance has the meaning that the mean square error of the estimates based on
$T$ consecutive states of the chain would be approximately $\frac{1}{T}\sigma_A^2(f)$, after a sufficiently long period
of "burn in" such that the chain is close enough to its stationary distribution. Asymptotic variance
can be used to compare the asymptotic performance of MCMC estimators based on different chains
with the same stationary distribution, where smaller asymptotic variance indicates that, asymptotically,
the MCMC estimator requires fewer samples to reach a specified accuracy.

Under the ergodic assumption, the asymptotic variance can be written as

$$\sigma_A^2(f) = V[f] + \sum_{\tau=1}^{\infty} \left( c_{A,f}(\tau) + c_{B,f}(\tau) \right), \tag{4}$$

where

$$c_{A,f}(\tau) = E_A[f(x_t) f(x_{t+\tau})] - E_A[f(x_t)]\, E_A[f(x_{t+\tau})]$$

is the covariance of the function value between two states that are $\tau$ time steps apart in the trajectory
of the Markov chain with transition operator $A$. Note that $\sigma_A^2(f)$ depends on both $A$ and its reverse
operator $B$, and $\sigma_A^2(f) = \sigma_B^2(f)$ since $A$ is also the reverse operator of $B$ by definition.

In this paper, we consider only the case where $\mathcal{S}$ is finite, i.e., $\mathcal{S} = \{1, \dots, S\}$, so that the transition
operators $A$ and $B$, the stationary distribution $\pi$, and the function $f$ can all be written in matrix form.

Let $\pi = [\pi(1), \dots, \pi(S)]^T$, $f = [f(1), \dots, f(S)]^T$, $A_{i,j} = A(i \to j)$, $B_{i,j} = B(i \to j)$. The
asymptotic variance can thus be written as

$$\sigma_A^2(f) = V[f] + \sum_{\tau=1}^{\infty} f^T \left( Q A^{\tau} + Q B^{\tau} - 2\pi\pi^T \right) f,$$

with $Q = \mathrm{diag}\{\pi\}$. Since $B$ is the reverse operator of $A$, $QA = B^T Q$. Also, from the ergodic
assumption,

$$\lim_{\tau \to \infty} A^{\tau} = \lim_{\tau \to \infty} B^{\tau} = R,$$

where $R = \mathbf{1}\pi^T$ is a square matrix in which every row is $\pi^T$. It follows that the asymptotic variance
can be represented by Kenney's formula [7] in the non-reversible case:

$$\sigma_A^2(f) = V[f] + 2\,(Qf)^T [\Lambda^{-1}]_H (Qf) - 2 f^T Q f, \tag{5}$$

where $[\cdot]_H$ denotes the Hermitian (symmetric) part of a matrix, and $\Lambda = Q + \pi\pi^T - J$, with $J = QA$
being the joint distribution of two consecutive states.

$^{1}$ Strictly speaking, the ergodic assumption is not necessary for the MCMC estimator to work, see [4].
However, we make the assumption to simplify the analysis.
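
As a numerical sanity check of Eq. 5, the following minimal sketch (assuming NumPy; the function names and the 3-state test chain are illustrative and not taken from the paper) computes the asymptotic variance both from Eq. 5 and from a truncated version of the autocovariance sum in matrix form, and compares the two.

```python
import numpy as np

def asymptotic_variance(A, pi, f):
    """Asymptotic variance via Eq. 5 for a finite ergodic chain.

    A  : (S, S) row-stochastic matrix, A[i, j] = A(i -> j)
    pi : (S,) stationary distribution of A
    f  : (S,) function values
    """
    Q = np.diag(pi)
    J = Q @ A                                  # joint distribution of consecutive states
    Lam = Q + np.outer(pi, pi) - J             # Lambda = Q + pi pi^T - J
    Lam_inv = np.linalg.inv(Lam)
    Lam_inv_H = 0.5 * (Lam_inv + Lam_inv.T)    # symmetric part of Lambda^{-1}
    var_f = pi @ f**2 - (pi @ f) ** 2          # V[f] under pi
    Qf = Q @ f
    return var_f + 2.0 * Qf @ Lam_inv_H @ Qf - 2.0 * f @ Q @ f

def asymptotic_variance_series(A, pi, f, n_terms=200):
    """Same quantity from the truncated autocovariance sum (matrix form of Eq. 4)."""
    Q = np.diag(pi)
    B = (Q @ A).T / pi[:, None]                # reverse operator, from Q A = B^T Q
    total = pi @ f**2 - (pi @ f) ** 2
    A_pow, B_pow = np.eye(len(pi)), np.eye(len(pi))
    for _ in range(n_terms):
        A_pow, B_pow = A_pow @ A, B_pow @ B
        total += f @ (Q @ A_pow + Q @ B_pow - 2.0 * np.outer(pi, pi)) @ f
    return total

# Illustrative non-reversible chain (doubly stochastic, so pi is uniform).
A_demo = np.array([[0.1, 0.6, 0.3],
                   [0.3, 0.1, 0.6],
                   [0.6, 0.3, 0.1]])
pi_demo = np.full(3, 1.0 / 3.0)
f_demo = np.array([1.0, 0.0, -2.0])

closed_form = asymptotic_variance(A_demo, pi_demo, f_demo)
series = asymptotic_variance_series(A_demo, pi_demo, f_demo)
```

The truncation length `n_terms` is an arbitrary choice; it only needs to be large enough for the powers of the chain to have converged to $R$.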

3 Improving the asymptotic variance 

It is clear from Eq. 5 that the transition operator $A$ affects the asymptotic variance only through the term
$[\Lambda^{-1}]_H$. If the chain is reversible, then $J$ is symmetric, so that $\Lambda$ is also symmetric, and therefore
comparing the asymptotic variance of two MCMC estimators becomes a matter of comparing their
$J$, namely, if$^{2}$ $J \preceq J' = QA'$, then $\sigma_A^2(f) \le \sigma_{A'}^2(f)$, for any $f$. This leads to a simple proof of
Peskun's theorem in the discrete case [3].

In the case where the Markov chain is non-reversible, i.e., $J$ is asymmetric, the analysis becomes
much more complicated. We start by providing a necessary and sufficient condition in section 3.1
which transforms the comparison of asymptotic variance based on arbitrary finite Markov chains
into a matrix ordering problem, using a result from matrix analysis. In section 3.2, a special case
is identified, in which the asymptotic variance of a reversible chain is compared to that of a
non-reversible one whose joint distribution over consecutive states is that of the reversible chain plus
a skew-Hermitian matrix. We prove that the resulting non-reversible chain has smaller asymptotic
variance, and provide a necessary and sufficient condition for the existence of such non-zero
skew-Hermitian matrices. Finally, in section 3.3 we provide a graphical interpretation of the result.

3.1 The general case 

From Eq. 5 we know that comparing the asymptotic variances of two MCMC estimators is equivalent
to comparing their $[\Lambda^{-1}]_H$. The following result from QE] allows us to write $[\Lambda^{-1}]_H$ in terms of the
symmetric and asymmetric parts of $\Lambda$.

Lemma 1 If a matrix $X$ is invertible, then $[X^{-1}]_H^{-1} = [X]_H + [X]_S^T [X]_H^{-1} [X]_S$, where $[X]_S$ is the
skew Hermitian part of $X$.
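
The identity in Lemma 1 can be checked numerically. The sketch below (assuming NumPy; the random test matrix and helper names are illustrative) builds a real matrix with a positive definite symmetric part, so that both $X$ and $[X]_H$ are invertible, and compares the two sides.

```python
import numpy as np

def sym(X):
    """Symmetric (Hermitian) part of a real matrix."""
    return 0.5 * (X + X.T)

def skew(X):
    """Skew-symmetric (skew-Hermitian) part of a real matrix."""
    return 0.5 * (X - X.T)

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
# Symmetric part M M^T + n I is positive definite; add an arbitrary skew part.
X = M @ M.T + n * np.eye(n) + skew(rng.standard_normal((n, n)))

lhs = np.linalg.inv(sym(np.linalg.inv(X)))                    # [X^{-1}]_H^{-1}
rhs = sym(X) + skew(X).T @ np.linalg.inv(sym(X)) @ skew(X)    # [X]_H + [X]_S^T [X]_H^{-1} [X]_S
lemma_gap = np.max(np.abs(lhs - rhs))
```

Up to floating-point error, `lemma_gap` should be zero.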

From Lemma 1 it follows immediately that in the discrete case, the comparison of MCMC
estimators based on two Markov chains with the same stationary distribution can be cast as a different
problem of matrix comparison, as stated in the following proposition.

Proposition 1 Let $A$, $A'$ be two transition operators of ergodic Markov chains with stationary
distribution $\pi$. Let $J = QA$, $J' = QA'$, $\Lambda = Q + \pi\pi^T - J$, $\Lambda' = Q + \pi\pi^T - J'$. Then the following
three conditions are equivalent:

1) $\sigma_A^2(f) \le \sigma_{A'}^2(f)$ for any $f$

2) $[\Lambda^{-1}]_H \preceq [(\Lambda')^{-1}]_H$

3) $[J]_H - [J]_S^T [\Lambda]_H^{-1} [J]_S \preceq [J']_H - [J']_S^T [\Lambda']_H^{-1} [J']_S$

Proof. First we show that $\Lambda$ is invertible. Following the steps in [3], for any $f \neq 0$,

$$f^T \Lambda f = f^T [\Lambda]_H f = f^T \left( Q + \pi\pi^T - J \right) f = \frac{1}{2} E\left[ \left( f(x_t) - f(x_{t+1}) \right)^2 \right] + E[f(x_t)]^2 > 0,$$

thus $[\Lambda]_H \succ 0$, and $\Lambda$ is invertible since $\Lambda f \neq 0$ for any $f \neq 0$.

Condition 1) and 2) are equivalent by definition. We now prove 2) is equivalent to 3). By Lemma 1,
and since inversion reverses the ordering of positive definite matrices,

$$[\Lambda^{-1}]_H \preceq [(\Lambda')^{-1}]_H \iff [\Lambda]_H + [\Lambda]_S^T [\Lambda]_H^{-1} [\Lambda]_S \succeq [\Lambda']_H + [\Lambda']_S^T [\Lambda']_H^{-1} [\Lambda']_S,$$

the result follows by noticing that $[\Lambda]_H = Q + \pi\pi^T - [J]_H$ and $[\Lambda]_S = -[J]_S$. ■

$^{2}$ For symmetric matrices $X$ and $Y$, we write $X \preceq Y$ if $Y - X$ is positive semi-definite, and $X \prec Y$ if
$Y - X$ is positive definite.

3.2 A special case 

Generally speaking, the conditions in Proposition 1 are very hard to verify, particularly because of
the term $[J]_S^T [\Lambda]_H^{-1} [J]_S$. Here we focus on a special case where $[J']_S = 0$, and $[J']_H = J' = [J]_H$.
This amounts to the case where the second chain is reversible, and its transition operator is the
average of the transition operator of the first chain and the associated reverse operator. The result is
formalized in the following corollary.

Corollary 1 Let $T$ be a reversible transition operator of a Markov chain with stationary distribution
$\pi$. Assume there is some $H$ that satisfies

Condition I. $\mathbf{1}^T H = 0$, $H\mathbf{1} = 0$, $H = -H^T$, and$^{3}$

Condition II. $T \pm Q^{-1} H$ are valid transition matrices.

Denote $A = T + Q^{-1} H$, $B = T - Q^{-1} H$, then

1) $A$ preserves $\pi$, and $B$ is the reverse operator of $A$.

2) $\sigma_A^2(f) = \sigma_B^2(f) \le \sigma_T^2(f)$ for any $f$.

3) If $H \neq 0$, then there is some $f$, such that $\sigma_A^2(f) < \sigma_T^2(f)$.

4) If $A_\varepsilon = T + (1 + \varepsilon)\, Q^{-1} H$ is a valid transition matrix, $\varepsilon > 0$, then $\sigma_{A_\varepsilon}^2(f) \le \sigma_A^2(f)$.

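Proof. For 1), notice that $\pi^T T = \pi^T$, so

$$\pi^T A = \pi^T T + \pi^T Q^{-1} H = \pi^T + \mathbf{1}^T H = \pi^T,$$

and similarly for $B$. Moreover

$$QA = QT + H = (QT - H)^T = \left( Q (T - Q^{-1} H) \right)^T = (QB)^T,$$

thus $B$ is the reverse operator of $A$.

For 2), $\sigma_A^2(f) = \sigma_B^2(f)$ follows from Eq. 5. Let $J' = QT$, $J = QA$. Note that $[J]_S = H$,

$$J' = QT = \frac{1}{2} (QA + QB) = [QA]_H = [J]_H,$$

and $[\Lambda]_H \succ 0$, thus $H^T [\Lambda]_H^{-1} H \succeq 0$. From Proposition 1, it follows that $\sigma_A^2(f) \le \sigma_T^2(f)$ for any $f$.

For 3), write $X = [\Lambda]_H$,

$$[\Lambda^{-1}]_H = \left( X + H^T X^{-1} H \right)^{-1} = X^{-1} - X^{-1} H^T \left( X + H X^{-1} H^T \right)^{-1} H X^{-1}.$$

Since $X \succ 0$, $H X^{-1} H^T \succeq 0$, one can write $\left( X + H X^{-1} H^T \right)^{-1} = \sum_{s=1}^{S} \lambda_s e_s e_s^T$, with $\lambda_s > 0$, $\forall s$.
Thus

$$H^T \left( X + H X^{-1} H^T \right)^{-1} H = \sum_{s=1}^{S} \lambda_s H e_s (H e_s)^T.$$

Since $H \neq 0$, there is at least one $s^*$, such that $H e_{s^*} \neq 0$. Let $f = Q^{-1} X H e_{s^*}$, then

$$\begin{aligned}
\frac{1}{2} \left[ \sigma_T^2(f) - \sigma_A^2(f) \right] &= (Qf)^T \left[ X^{-1} - \left( X + H^T X^{-1} H \right)^{-1} \right] (Qf) \\
&= (Qf)^T X^{-1} H^T \left( X + H X^{-1} H^T \right)^{-1} H X^{-1} (Qf) \\
&= (H e_{s^*})^T \left[ \sum_{s=1}^{S} \lambda_s H e_s (H e_s)^T \right] (H e_{s^*}) \\
&= \lambda_{s^*} \| H e_{s^*} \|^4 + \sum_{s \neq s^*} \lambda_s \left( e_{s^*}^T H^T H e_s \right)^2 > 0.
\end{aligned}$$

For 4), let $\Lambda_\varepsilon = Q + \pi\pi^T - Q A_\varepsilon$, then for $\varepsilon > 0$,

$$[\Lambda_\varepsilon^{-1}]_H = \left( X + (1 + \varepsilon)^2 H^T X^{-1} H \right)^{-1} \preceq \left( X + H^T X^{-1} H \right)^{-1} = [\Lambda^{-1}]_H,$$

by Eq. 5 we have $\sigma_{A_\varepsilon}^2(f) \le \sigma_A^2(f)$ for any $f$. ■

$^{3}$ We write $\mathbf{1}$ for the $S$-dimensional column vector of 1's.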
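
The statements of Corollary 1 can be illustrated on a small example. The sketch below (assuming NumPy; the 3-state joint matrix `J`, the strength values, and the helper names are hypothetical choices made for illustration) builds a reversible chain, perturbs it with a skew-symmetric $H$ satisfying Condition I, and evaluates Eq. 5 for a batch of random functions.

```python
import numpy as np

def asym_var(P, pi, f):
    """Asymptotic variance from Eq. 5 for transition matrix P with stationary pi."""
    Q = np.diag(pi)
    Lam = Q + np.outer(pi, pi) - Q @ P
    Lam_inv = np.linalg.inv(Lam)
    Qf = pi * f
    return (pi @ f**2 - (pi @ f) ** 2
            + Qf @ (Lam_inv + Lam_inv.T) @ Qf   # equals 2 (Qf)^T [Lam^{-1}]_H (Qf)
            - 2.0 * (pi @ f**2))

pi = np.array([0.2, 0.3, 0.5])
Q = np.diag(pi)
Q_inv = np.diag(1.0 / pi)
# Symmetric joint matrix with row sums pi, so T = Q^{-1} J is reversible.
J = np.array([[0.05, 0.05, 0.10],
              [0.05, 0.10, 0.15],
              [0.10, 0.15, 0.25]])
T = Q_inv @ J
# Skew-symmetric, zero row and column sums (Condition I).
U12 = np.array([[0.0, 1.0, -1.0],
                [-1.0, 0.0, 1.0],
                [1.0, -1.0, 0.0]])

def perturbed(eps):
    """Return A = T + Q^{-1} H and B = T - Q^{-1} H for H = eps * U12."""
    H = eps * U12
    return T + Q_inv @ H, T - Q_inv @ H

A, B = perturbed(0.04)          # strength below the smallest off-diagonal J entry
A_weak, _ = perturbed(0.02)     # half the strength, for statement 4)

rng = np.random.default_rng(1)
fs = rng.standard_normal((20, 3))
var_T = np.array([asym_var(T, pi, f) for f in fs])
var_A = np.array([asym_var(A, pi, f) for f in fs])
var_B = np.array([asym_var(B, pi, f) for f in fs])
var_A_weak = np.array([asym_var(A_weak, pi, f) for f in fs])
```

For every sampled $f$, `var_A` matches `var_B`, does not exceed `var_T`, and does not exceed `var_A_weak`, in line with statements 2) and 4).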

Corollary 1 shows that starting from a reversible Markov chain, as long as one can find a
non-zero $H$ satisfying Conditions I and II, then the asymptotic performance of the MCMC estimator is
guaranteed to improve. The next question to ask is whether such an $H$ exists, and, if so, how to find
one. We answer this question by first looking at Condition I. The following proposition shows that
any $H$ satisfying this condition can be constructed systematically.

Proposition 2 Let $H$ be an $S$-by-$S$ matrix. $H$ satisfies Condition I if and only if $H$ can be written
as the linear combination of $\frac{1}{2}(S-1)(S-2)$ matrices, with each matrix of the form

$$U_{i,j} = u_i u_j^T - u_j u_i^T, \quad 1 \le i < j \le S - 1.$$

Here $u_1, \dots, u_{S-1}$ are $S - 1$ non-zero linearly independent vectors satisfying $u_s^T \mathbf{1} = 0$.

Proof. Sufficiency. It is straightforward to verify that each $U_{i,j}$ is skew-Hermitian and satisfies
$U_{i,j} \mathbf{1} = 0$. Such properties are inherited by any linear combination of $U_{i,j}$.

Necessity. We show that there are at most $\frac{1}{2}(S-1)(S-2)$ linearly independent bases for all $H$
such that $H = -H^T$ and $H\mathbf{1} = 0$. On one hand, any $S$-by-$S$ skew-Hermitian matrix can be written
as the linear combination of $\frac{1}{2}S(S-1)$ matrices of the form

$$V_{i,j} : \{V_{i,j}\}_{m,n} = \delta(m,i)\,\delta(n,j) - \delta(n,i)\,\delta(m,j),$$

where $\delta$ is the standard delta function such that $\delta(i,j) = 1$ if $i = j$ and $0$ otherwise. However,
the constraint $H\mathbf{1} = 0$ imposes $S - 1$ linearly independent constraints, which means that out of
$\frac{1}{2}S(S-1)$ parameters, only

$$\frac{1}{2}S(S-1) - (S-1) = \frac{1}{2}(S-1)(S-2)$$

are independent.

On the other hand, selecting two non-identical vectors from $u_1, \dots, u_{S-1}$ results in
$\frac{1}{2}(S-1)(S-2)$ different $U_{i,j}$. It has still to be shown that these $U_{i,j}$ are linearly independent.

Assume

$$0 = \sum_{1 \le i < j \le S-1} \kappa_{i,j} U_{i,j} = \sum_{1 \le i < j \le S-1} \kappa_{i,j} \left( u_i u_j^T - u_j u_i^T \right), \quad \kappa_{i,j} \in \mathbb{R}.$$

Consider two cases: Firstly, assume $u_1, \dots, u_{S-1}$ are orthogonal, i.e., $u_i^T u_j = 0$ for $i \neq j$. For a
particular $u_s$,

$$\begin{aligned}
0 = \sum_{1 \le i < j \le S-1} \kappa_{i,j} U_{i,j} u_s &= \sum_{1 \le i < j \le S-1} \kappa_{i,j} \left( u_i u_j^T - u_j u_i^T \right) u_s \\
&= \sum_{1 \le i < s} \kappa_{i,s}\, u_i \left( u_s^T u_s \right) - \sum_{s < j \le S-1} \kappa_{s,j}\, u_j \left( u_s^T u_s \right).
\end{aligned}$$

Since $u_s^T u_s \neq 0$ and the vectors $u_i$ are linearly independent, it follows that $\kappa_{i,s} = \kappa_{s,j} = 0$, for all
$1 \le i < s < j \le S - 1$. This holds for any $u_s$, so all $\kappa_{i,j}$ must be 0, and therefore $U_{i,j}$ are linearly
independent by definition. Secondly, if $u_1, \dots, u_{S-1}$ are not orthogonal, one can construct a new set of
orthogonal vectors $\tilde{u}_1, \dots, \tilde{u}_{S-1}$ from $u_1, \dots, u_{S-1}$ through Gram-Schmidt orthogonalization, and
create a different set of bases $\tilde{U}_{i,j}$. It is easy to verify that each $\tilde{U}_{i,j}$ is a linear combination of $U_{i,j}$.
Since all $\tilde{U}_{i,j}$ are linearly independent, it follows that $U_{i,j}$ must also be linearly independent. ■
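
Proposition 2 can be checked numerically for a small state space. The sketch below (assuming NumPy; it uses the particular vectors $u_s = e_s - e_S$, and the projection used to generate a random test matrix is an illustrative device, not part of the paper) builds all $U_{i,j}$, confirms that they are linearly independent, and expresses a random matrix satisfying Condition I in this basis by least squares.

```python
import numpy as np
from itertools import combinations

def vortex_basis(S):
    """Matrices U_{i,j} = u_i u_j^T - u_j u_i^T with u_s = e_s - e_S (0-based keys)."""
    E = np.eye(S)
    u = [E[s] - E[S - 1] for s in range(S - 1)]
    return {(i, j): np.outer(u[i], u[j]) - np.outer(u[j], u[i])
            for i, j in combinations(range(S - 1), 2)}

S = 6
basis = vortex_basis(S)
M = np.stack([U.ravel() for U in basis.values()], axis=1)   # shape (S*S, number of U_{i,j})
rank = np.linalg.matrix_rank(M)

# Random matrix satisfying Condition I: project a skew-symmetric matrix
# onto the subspace orthogonal to the all-ones vector on both sides.
rng = np.random.default_rng(0)
K = rng.standard_normal((S, S))
P = np.eye(S) - np.ones((S, S)) / S
H = P @ (K - K.T) @ P

coef, *_ = np.linalg.lstsq(M, H.ravel(), rcond=None)
residual = np.linalg.norm(M @ coef - H.ravel())
```

With $S = 6$ there are $\frac{1}{2}(S-1)(S-2) = 10$ basis matrices, `rank` equals 10, and `residual` vanishes up to rounding.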

Proposition 2 confirms the existence of non-zero $H$ satisfying Condition I. We now move to
Condition II, which requires that both $QT + H$ and $QT - H$ remain valid joint distribution matrices, i.e.
all entries must be non-negative and sum up to 1. Since $\mathbf{1}^T (QT + H)\, \mathbf{1} = 1$ by Condition I, only
the non-negative constraint needs to be considered.

It turns out that not all reversible Markov chains admit a non-zero $H$ satisfying both Condition I and
II. For example, consider a Markov chain with only two states. It is impossible to find a non-zero
skew-Hermitian $H$ such that $H\mathbf{1} = 0$, because all 2-by-2 skew-Hermitian matrices are proportional
to $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$.


The next proposition gives the sufficient and necessary condition for the existence of a non-zero H 
satisfying both I and II. In particular, it shows an interesting link between the existence of such H 
and the connectivity of the states in the reversible chain. 

Proposition 3 Assume a reversible ergodic Markov chain with transition matrix $T$ and let $J = QT$.
The state transition graph $\mathcal{G}_T$ is defined as the undirected graph with node set $\mathcal{S} = \{1, \dots, S\}$ and
edge set $\{(i, j) : J_{i,j} > 0,\ 1 \le i < j \le S\}$. Then there exists some non-zero $H$ satisfying Condition
I and II, if and only if there is a loop in $\mathcal{G}_T$.



Proof. Sufficiency: Without loss of generality, assume the loop is made of states $1, 2, \dots, N$ and
edges $(1, 2), \dots, (N-1, N), (N, 1)$, with $N \ge 3$. By definition, $J_{1,N} > 0$, and $J_{n,n+1} > 0$ for
all $1 \le n \le N - 1$. A non-zero $H$ can then be constructed as

$$H_{i,j} = \begin{cases}
\varepsilon, & \text{if } 1 \le i \le N-1 \text{ and } j = i + 1, \\
-\varepsilon, & \text{if } 2 \le i \le N \text{ and } j = i - 1, \\
\varepsilon, & \text{if } i = N \text{ and } j = 1, \\
-\varepsilon, & \text{if } i = 1 \text{ and } j = N, \\
0, & \text{otherwise.}
\end{cases}$$

Here

$$\varepsilon = \min_{1 \le n \le N-1} \left\{ J_{n,n+1},\ 1 - J_{n,n+1},\ J_{1,N},\ 1 - J_{1,N} \right\}.$$

Clearly, $\varepsilon > 0$, since all the items in the minimum are above 0. It is trivial to verify that $H = -H^T$
and $H\mathbf{1} = 0$.

Necessity: Assume there are no loops in $\mathcal{G}_T$, then all states in the chain must be organized in a tree,
following the ergodic assumption. In other words, there are exactly $2(S-1)$ non-zero off-diagonal
elements in $J$. Moreover, these $2(S-1)$ elements are arranged symmetrically about the diagonal and
span every column and row of $J$.

Because the states are organized in a tree, there is at least one leaf node $s$ in $\mathcal{G}_T$, with a single
neighbor $s'$. Row $s$ and column $s$ in $J$ thus look like $r_s = [\cdots, p_{s,s}, \cdots, p_{s,s'}, \cdots]$ and its transpose,
respectively, with $p_{s,s} \ge 0$ and $p_{s,s'} > 0$, and all other entries being 0.

Assume that one wants to construct some $H$, such that $J \pm H \ge 0$. Let $h_s$ be the $s$-th row of $H$.
Since $r_s \pm h_s \ge 0$, all except the $s'$-th element in $h_s$ must be 0. But since $h_s \mathbf{1} = 0$, the whole $s$-th
row, and thus the $s$-th column, of $H$ must be 0.

Having set the $s$-th column and row of $H$ to 0, one can consider the reduced Markov chain with one
state less, and repeat with another leaf node. Working progressively along the tree, it follows that all
rows and columns in $H$ must be 0. ■
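
The two directions of Proposition 3 can be illustrated numerically. In the sketch below (assuming NumPy; the helper names and the uniform nearest-neighbour test chains are hypothetical), `loop_vortex` builds the $H$ of the sufficiency proof along a given loop, using the smallest joint probability on the loop as strength, which already keeps $J \pm H$ non-negative. The function `vortex_degrees_of_freedom` is an observation not stated in the paper: a skew-symmetric $H$ supported on the edges of the transition graph with $H\mathbf{1} = 0$ is a circulation on that graph, so the number of free parameters is the number of edges minus the rank of the node-edge incidence matrix, which is zero exactly for trees.

```python
import numpy as np

def loop_vortex(J, loop, eps=None):
    """H for a vortex loop[0] -> loop[1] -> ... -> loop[-1] -> loop[0] (0-based states)."""
    edges = list(zip(loop, loop[1:] + loop[:1]))
    if eps is None:
        eps = min(J[i, j] for i, j in edges)   # largest strength with J - H >= 0
    H = np.zeros_like(J)
    for i, j in edges:
        H[i, j] += eps
        H[j, i] -= eps
    return H

def vortex_degrees_of_freedom(J):
    """Dimension of {H skew-symmetric, H 1 = 0, H zero wherever J is zero}."""
    S = J.shape[0]
    edges = [(i, j) for i in range(S) for j in range(i + 1, S) if J[i, j] > 0]
    if not edges:
        return 0
    D = np.zeros((S, len(edges)))              # node-edge incidence matrix
    for k, (i, j) in enumerate(edges):
        D[i, k], D[j, k] = 1.0, -1.0
    return len(edges) - np.linalg.matrix_rank(D)

def uniform_chain_joint(S, ring):
    """Joint matrix J = QT of a nearest-neighbour chain with uniform pi (test chain)."""
    J = np.zeros((S, S))
    for s in range(S - 1):
        J[s, s + 1] = J[s + 1, s] = 1.0 / (2 * S)
    if ring:
        J[0, S - 1] = J[S - 1, 0] = 1.0 / (2 * S)
    J += np.diag(1.0 / S - J.sum(axis=1))      # leftover mass goes on the diagonal
    return J

J_path = uniform_chain_joint(5, ring=False)    # transition graph is a tree
J_ring = uniform_chain_joint(5, ring=True)     # transition graph is a single loop
H_ring = loop_vortex(J_ring, [0, 1, 2, 3, 4])
```

For the path chain no non-zero $H$ exists, while the ring admits a one-parameter family of vortices.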

Proposition 3, together with Proposition 2, implies that all reversible chains can be improved in terms
of asymptotic variance using Corollary 1, except those whose transition graphs are trees. In practice,
the non-tree constraint is not a problem because almost all current methods of constructing reversible
chains generate chains with loops.



3.3 Graphical interpretation 

In this subsection we provide a graphical interpretation of the results in the previous sections.
Starting from a simple case, consider a reversible Markov chain with three states forming a loop.
Let $u_1 = [1, 0, -1]^T$ and $u_2 = [0, 1, -1]^T$. Clearly, $u_1$ and $u_2$ are linearly independent and
$u_1^T \mathbf{1} = u_2^T \mathbf{1} = 0$. By Propositions 2 and 3, there exists some $\varepsilon > 0$, such that $H = \varepsilon U_{1,2}$
satisfies Condition I and II, with $U_{1,2} = u_1 u_2^T - u_2 u_1^T$. Write $U_{1,2}$ and $J + H$ in explicit form,

$$U_{1,2} = \begin{bmatrix} 0 & 1 & -1 \\ -1 & 0 & 1 \\ 1 & -1 & 0 \end{bmatrix}, \qquad
J + H = \begin{bmatrix} p_{1,1} & p_{1,2} + \varepsilon & p_{1,3} - \varepsilon \\ p_{2,1} - \varepsilon & p_{2,2} & p_{2,3} + \varepsilon \\ p_{3,1} + \varepsilon & p_{3,2} - \varepsilon & p_{3,3} \end{bmatrix},$$

with $p_{i,j}$ being the probability of the consecutive states being $i, j$. It is clear that in $J + H$, the
probability of jumps $1 \to 2$, $2 \to 3$, and $3 \to 1$ is increased, and the probability of jumps in the
opposite direction is decreased. Intuitively, this amounts to adding a 'vortex' of direction $1 \to 2 \to
3 \to 1$ in the state transition. Similarly, the joint probability matrix for the reverse operator is $J - H$,
which adds a vortex in the opposite direction. This simple case also gives an explanation of why
adding or subtracting non-zero $H$ can only be done where a loop already exists, since the operation
requires subtracting $\varepsilon$ from all entries in $J$ corresponding to edges in the loop.

$$H = -\varepsilon U_{6,8} - \varepsilon U_{5,6} - \varepsilon U_{4,5} - \varepsilon U_{3,4} + \varepsilon U_{3,8}$$

Figure 1: Illustration of the construction of larger vortices. The left hand side is a state transition
graph of a reversible Markov chain with $S = 9$ states, with a vortex $3 \to 8 \to 6 \to 5 \to 4$ of
strength $\varepsilon$ inserted. The corresponding $H$ can be expressed as the linear combination of $U_{i,j}$, as
shown on the right hand side of the graph. We start from the vortex $8 \to 6 \to 9 \to 8$, and add
one vortex at a time. The dotted lines correspond to edges on which the flows cancel out when a new
vortex is added. For example, when vortex $6 \to 5 \to 9 \to 6$ is added, edge $9 \to 6$ cancels edge
$6 \to 9$ in the previous vortex, resulting in a larger vortex with four states. Note that in this way one
can construct vortices which do not include state 9, although each $U_{i,j}$ is a vortex involving 9.



In the general case, define $S - 1$ vectors $u_1, \dots, u_{S-1}$ as

$$u_s = [0, \dots, 0, 1, 0, \dots, 0, -1]^T,$$

with the 1 at the $s$-th element. It is straightforward to see that $u_1, \dots, u_{S-1}$ are linearly independent
and $u_s^T \mathbf{1} = 0$ for all $s$, thus any $H$ satisfying Condition I can be represented as the linear combination
of $U_{i,j} = u_i u_j^T - u_j u_i^T$, with each $U_{i,j}$ containing 1's at positions $(i, j)$, $(j, S)$, $(S, i)$, and $-1$'s at
positions $(i, S)$, $(S, j)$, $(j, i)$. It is easy to verify that adding $\varepsilon U_{i,j}$ to $J$ amounts to introducing a
vortex of direction $i \to j \to S \to i$, and any vortex of $N$ states ($N \ge 3$) $s_1 \to s_2 \to \cdots \to s_N \to s_1$
can be represented by the linear combination $\sum_{n=1}^{N-2} U_{s_n, s_{n+1}}$ in the case of state $S$ being in the
vortex and assuming $s_N = S$ without loss of generality, or $U_{s_N, s_1} + \sum_{n=1}^{N-1} U_{s_n, s_{n+1}}$ if $S$ is not in
the vortex, as demonstrated in Figure 1. Therefore, adding or subtracting an $H$ to $J$ is equivalent to
inserting a number of vortices into the state transition map.
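
The decomposition described in the caption of Figure 1 can be reproduced directly. The sketch below (assuming NumPy; the helper names are illustrative and states are passed 1-based to match the text) uses $u_s = e_s - e_S$ with $S = 9$, forms the combination $-U_{6,8} - U_{5,6} - U_{4,5} - U_{3,4} + U_{3,8}$ with unit strength, and compares it with the matrix of the vortex $3 \to 8 \to 6 \to 5 \to 4 \to 3$ written down edge by edge.

```python
import numpy as np

S = 9

def U(i, j):
    """U_{i,j} = u_i u_j^T - u_j u_i^T with u_s = e_s - e_S; i, j are 1-based states."""
    E = np.eye(S)
    ui, uj = E[i - 1] - E[S - 1], E[j - 1] - E[S - 1]
    return np.outer(ui, uj) - np.outer(uj, ui)

def cycle_matrix(cycle):
    """+1 on each directed edge of the cycle, -1 on the reversed edge (1-based states)."""
    H = np.zeros((S, S))
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        H[a - 1, b - 1] += 1.0
        H[b - 1, a - 1] -= 1.0
    return H

# Each term is a three-state vortex through state 9; the flows through 9 cancel.
H_fig = -U(6, 8) - U(5, 6) - U(4, 5) - U(3, 4) + U(3, 8)
H_direct = cycle_matrix([3, 8, 6, 5, 4])
```

All entries in row and column 9 of `H_fig` cancel, leaving a vortex that does not pass through state 9.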



4 An example 

Adding vortices to the state transition graph forces the Markov chain to move in loops following
pre-specified directions. The benefit of this can be illustrated in the following example. Consider a
reversible Markov chain with $S$ states forming a ring, namely from state $s$ one can only jump to $s \oplus 1$
or $s \ominus 1$, with $\oplus$ and $\ominus$ being the mod-$S$ summation and subtraction. The only possible non-zero $H$
in this example is of form $\varepsilon \sum_{s=1}^{S-2} U_{s,s+1}$, corresponding to vortices on the large ring.

We assume uniform stationary distribution $\pi(s) = \frac{1}{S}$. In this case, any reversible chain behaves
like a random walk. The chain which achieves minimal asymptotic variance is the one with the
probability of both jumping forward and backward being $\frac{1}{2}$. The expected number of steps for

this chain to reach the state € edges away is However, adding the vortex reduces this number to 
Figure 2: Demonstration of the vortex effect: (a) and (b) show two different, reversible Markov
chains, each containing 128 states connected in a ring. The equilibrium distribution of the chains is
depicted by the gray inner circles; darker shades correspond to higher probability. The equilibrium
distribution of chain (a) is uniform, while that of (b) contains two peaks half a ring apart. In addition,
the chains are constructed such that the probability of staying in the same state is zero. In each
case, two trajectories, of length 1000, are generated from the chain with and without the vortex,
starting from the state pointed to by the arrow. The length of the bar radiating out from a given
state represents the relative frequency of visits to that state, with red and blue bars corresponding
to chains with and without vortex, respectively. It is clear from the graph that trajectories sampled
from reversible chains spread much more slowly, with only 1/5 of the states reached in (a) and 1/3 in (b),
and the trajectory in (b) does not escape from the current peak. On the other hand, with vortices
added, trajectories of the same length spread over all the states, and effectively explore both peaks
of the stationary distribution in (b). Plot (c) shows the correlation of function values (normalized
by variance) between two states $\tau$ time steps apart, with $\tau$ ranging from 1 to 600. Here we take
the Markov chains from (b) and use function $f(s) = \cos\left(4\pi \cdot \frac{s}{128}\right)$. When vortices are added, not
only do the absolute values of the correlations go down significantly, but also their signs alternate,
indicating that these correlations tend to cancel out in the sum of Eq. 5.

roughly ^ for large 5, suggesting that it is much easier for the non-reversible chain to reach faraway 
states, especially for large $S$. In the extreme case, when $\varepsilon = \frac{1}{2S}$, the chain cycles deterministically,
reducing asymptotic variance to zero. Also note that the reversible chain here has zero probability
of staying in the current state, thus cannot be further improved using Peskun's theorem.
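
The effect of the vortex strength on the ring can be evaluated with Eq. 5. The sketch below (assuming NumPy; the ring size $S = 15$, the test function, and the list of strengths are illustrative choices, with an odd $S$ picked so that the symmetric walk is aperiodic) adds a vortex of strength $\varepsilon$ to the joint matrix, which changes the forward probability to $\frac{1}{2} + S\varepsilon$, and computes the asymptotic variance for strengths approaching $\frac{1}{2S}$.

```python
import numpy as np

def asym_var(P, pi, f):
    """Asymptotic variance from Eq. 5 for transition matrix P with stationary pi."""
    Q = np.diag(pi)
    Lam = Q + np.outer(pi, pi) - Q @ P
    Lam_inv = np.linalg.inv(Lam)
    Qf = pi * f
    return (pi @ f**2 - (pi @ f) ** 2
            + Qf @ (Lam_inv + Lam_inv.T) @ Qf   # equals 2 (Qf)^T [Lam^{-1}]_H (Qf)
            - 2.0 * (pi @ f**2))

S = 15
pi = np.full(S, 1.0 / S)
fwd = np.roll(np.eye(S), 1, axis=1)    # fwd[s, (s + 1) % S] = 1
bwd = fwd.T                            # bwd[s, (s - 1) % S] = 1

def ring_chain(eps):
    """Transition matrix T + Q^{-1} H for a ring vortex of strength eps in J."""
    T = 0.5 * (fwd + bwd)              # reversible walk, no holding probability
    H = eps * (fwd - bwd)              # skew-symmetric, zero row sums
    return T + S * H                   # Q^{-1} = S * I for uniform pi

f = np.cos(2.0 * np.pi * np.arange(S) / S)
fractions = [0.0, 0.5, 0.9, 0.99]      # eps as a fraction of the maximum 1 / (2 S)
variances = np.array([asym_var(ring_chain(c / (2 * S)), pi, f) for c in fractions])
```

The values in `variances` decrease as the strength grows and become small as $\varepsilon$ approaches $\frac{1}{2S}$, where the chain becomes a deterministic cycle.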

Our intuition about why adding vortices helps is that chains with vortices move faster than the
reversible ones, making the function values of the trajectories less correlated. This effect is
demonstrated in Figure 2.

5 Conclusion 

In this paper, we have presented a new way of converting a reversible finite Markov chain into a non- 
reversible one, with the theoretical guarantee that the asymptotic variance of the MCMC estimator 
based on the non-reversible chain is reduced. The method is applicable to any reversible chain whose 
states are not connected through a tree, and can be interpreted graphically as inserting vortices into 
the state transition graph. 

The results confirm that non-reversible chains are fundamentally better than reversible ones. The
general framework of Proposition 1 suggests further improvements of MCMC's asymptotic
performance, by applying other results from matrix analysis to asymptotic variance reduction. The
combined results of Corollary 1 and Propositions 2 and 3 provide a specific way of doing so, and
pose interesting research questions. Which combinations of vortices yield optimal improvements
for a given chain? Finding one of them is a combinatorial optimization problem. How can a good
combination be constructed in practice, using limited history and computational resources?